MEASUREMENT SYSTEM

The present invention concerns a method and 
apparatus for deriving information from physical events. 
There are many events from which it is desirable to have 
measurements but which are not susceptible to direct 
measurement or in which measurement of individual events 
can only be carried out with very great difficulty. A 
particularly important class of such events is the flow 
of particles in a stream of gaseous or liquid fluid. 
Systems of such particle flow include smoke in columns 
of air, unwanted particles in liquid systems such as high 
pressure hydraulic systems, and biological particles 
which can indicate the presence of bacteria or disease 
in urine. Measurement of all of these systems provides 
substantial problems in that the size of particles can 
vary, the velocity with which the particles are 
travelling can vary and the number of particles in any 
one unit of volume can also vary. Additionally the times 
of arrival of individual particles into the confines of 
the measurement apparatus cannot be predicted exactly and 
the shape and physical nature of the particles can vary. 
All these factors perturb the final measurement. 

Nevertheless the detection and measurement of 
particles in flowing systems is frequently of great 
importance. One of the examples already given relates 
to the measurement of bacteria in urine. Accurate 
measurement of the size of such bacteria particles can 
give very good indications as to the presence or not of 
certain diseases. In high pressure hydraulic systems
involving filtering, the breakdown of filters can have
catastrophic results, and the measurement of particles in
the hydraulic flow can provide an early indication of
the efficiency of the filter system.

As a result of these demands for measurement systems
a number of particle sizing techniques have been
developed. Some of these are based on Doppler methods
and require the interferometric combination of crossed
laser beams to create a structured pattern. This
requires coherent laser light sources and precision
lasers, or more recently the use of diffraction gratings.
The extent of the structured light field necessarily
occupies a large part of the inspection volume and
consequently requires quality optical components. An
example of such a technique is disclosed in United States
Patent No. 4854705.

An example of a heuristic approach in which a more
direct attempt is made to measure the individual sizes
and velocities of particles in a flowing stream is
described in International Patent Application No.
WO93/16368. In this specification a flow of particles
is passed through a cell and a structured monochromatic
light field is projected into the cell. The particles
pass transversely and successively through the spaced
variations of the light field, the spacings of which are
set in accordance with the expected range of particle
size. Variations in light intensity caused by the
passage of the particles relative to the light field are
detected and the size of a particle can be calculated by
plotting the mean peak signal of the sensor as a function
of the normalised peak-to-trough variation in the output
pulses generated by the passage of the particle through
the light field. Such a system can be made in an
extremely compact and relatively inexpensive manner but
is not suitable for relatively large flow sizes where
there are likely to be a substantial number of particles
in the volume where the measurements are being made.
Thus this system is not suited, for example, to measuring
the distribution of particles in the situation where it
is required to provide measurements of smoke particles in
a gas flow.

Thus the present invention is concerned with 
providing a solution to the above problems and in 
particular a solution to the problem of providing 
accurate measurements of multiple physical events which 
are not directly observable. 

United States Patent Specification No US-A-5347541
discloses a Bayesian blind equalizer for use in digital
communication comprising a plurality of parallel
processors. Each processor in turn generates an
estimated signal and an updated metric in order to be
able to decode digital data despite intersymbol
interference.

UK Patent Specification No GB-2209414-A discloses a
navigation system using a recursive estimator employing
Bayesian logic.

International Patent Specification No WO92/03905
discloses a method and apparatus for optimally allocating
resources, and discloses an iterative process
utilising a probabilistic network in which each node
corresponds to a variable and each [word illegible in
the source] corresponds to a constraint, so that the
topology of the network directly reflects the structure
of the problem. The network is iterated until it reaches
a stable state.

United States Patent Specification No US-A-4661913
discloses a flow apparatus through which unknown
particles to be measured are passed. Data generated by
the passage of the particles is stored, and this data is
then compared with data detected from sample particles in
order to classify the unknown particles.

In order that the present invention may be more 
readily understood an embodiment thereof will now be 
described by way of example and with reference to the 
accompanying drawings, in which: 

Figure 1 is a cross-section through an embodiment of 
a particle measurement system; 

Figure 2 illustrates a sample reading; 
Figure 3 is a sectional plan view of an embodiment
of a measurement device;

Figure 4 is a circuit diagram of a processing
circuit associated with the embodiment of Figure 3;

Figure 5 is a print out of particle measurements
carried out with the apparatus of Figures 3 and 4;

Figures 6A and 6B are graphs of functions which
are of particular importance when the processing to be
described hereinafter is carried out; and

Figure 7 is a flow diagram setting out the 
computations carried out by the circuit of Figure 4. 

When considering the passage of an unknown particle 
through a light field it will be appreciated that the 
passage of the particle can have two main effects which 
can be detected by means of appropriate sensors.
Firstly, the particle can obscure, that is directly
interfere with the passage of the light, or secondly it
can scatter the light. The detection of these effects
is an indication of the presence of the particle but is
in itself a poor indication of the size, velocity and
shape of the particle. Factors which cannot be directly
deduced from the detected light include the size of the
particle r, its coordinates within the measurement area
z, y, its velocity v, its shape el, and the exact time of
its arrival T. In most measurement systems and in
particular in the embodiment being described, the main
unknown of interest is particle size.
It will be appreciated that a direct measurement of
the size of a particle such as a particle of smoke or grit
in a flow of oil is not practicable in a rapid, real-time
manner. That is, the event which is to be measured is to
some extent unobservable. The present embodiment
accordingly proposes a system which is based on the
probability of the size of a particle causing a detected
event. It will accordingly be appreciated that the
following description is concerned with the processing
of detected data so as to arrive at probabilities rather
than direct measurements.

Referring now to Figure 1 of the accompanying
drawings, this shows in diagrammatic form the optical
layout of apparatus for sizing particles in a fluid
stream. The apparatus is generally indicated at 10 and
comprises a light source 12, a flow cell 14 through which
a fluid flows at 15 and a light detector 16.

The light source 12 will be described in greater
detail hereinafter and is imaged by lenses indicated
at 18 into the inspection volume of the flow cell 14.
The light source 12 provides a series of spaced intensity
peaks in the inspection volume transverse to the flow of
particles to be measured through the inspection volume.
In the present embodiment the light source provides three
focused facets and the orientation of the long side of
these facets is normal to the flow direction. The
magnification of the light source is chosen so that the
separation (q) of the image bars of the lines approximates
to the range of particle size to be measured.

Light from the image volume in the flow cell is
collected by lens 22 into the detector 16. The detector
16 can be a PIN diode or an avalanche photodiode.

Particles that traverse the focused light field are
thus exposed to light from each facet or focused
variation in light intensity in turn. The intensity of
light detected is thus modulated with a frequency U/q,
where U is the particle transverse velocity, and with
intensity given by the convolution of the particle
scattering cross-section with the structured light image.
A particle P of diameter D where D >> q effectively
smears out the structure. Particles for which D < q
partially resolve the structure and thus partially
modulate the signal intensity, and particles for which D
<< q fully resolve the structure and display full
modulation with intensity limited only by the detection
noise limit. A typical output for a particle passing
through the structured light field is shown in the graph
of Figure 2.
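
The dependence of the modulation on the ratio of D to q can be illustrated with a deliberately crude one-dimensional model. The sketch below is not taken from the embodiment: it assumes rectangular light bars of width q/2 centred at -q, 0 and +q, and treats the particle as a box of width D, so that the convolution reduces to an overlap length. All function names are hypothetical.

```python
def overlap(lo1, hi1, lo2, hi2):
    """Length of the intersection of two intervals."""
    return max(0.0, min(hi1, hi2) - max(lo1, lo2))


def signal(x, d, q, bar_width):
    """Light intercepted by a box particle of width d centred at x.

    Three bars are centred at -q, 0 and +q (assumed geometry).
    """
    total = 0.0
    for centre in (-q, 0.0, q):
        total += overlap(x - d / 2, x + d / 2,
                         centre - bar_width / 2, centre + bar_width / 2)
    return total


def modulation_depth(d, q, steps=2001):
    """1 - min/max of the signal as the particle centre moves from -q to +q."""
    bar_width = q / 2  # assumed bar width, not specified in the text
    xs = [-q + 2 * q * k / (steps - 1) for k in range(steps)]
    values = [signal(x, d, q, bar_width) for x in xs]
    return 1.0 - min(values) / max(values)


def modulation_frequency(u, q):
    """Modulation frequency U/q for transverse velocity u and bar separation q."""
    return u / q
```

In this model a particle much smaller than q gives a depth close to 1 (the structure is fully resolved), while a particle several times larger than q gives a depth close to 0 (the structure is smeared out). A box-shaped particle also produces zeros of modulation at some intermediate widths, which is an artefact of the simplified shapes.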

It will be appreciated, as already stated, that the
signals detected by the sensor can never be an exact
representation of particle size. Amongst other factors
the shape and nature of the particle will also affect its
impact on the light, as will its co-ordinates of passage
relative to the focal plane.

It is a general concern of the present invention
to establish these probabilities with such a degree of
certainty that they can be taken as actual measurements
for practical purposes.

Referring now to Figure 3 of the accompanying
drawings, this shows a particular version of the flow
cell, light source and light detector shown in
diagrammatic form in Figure 1. Thus the apparatus in
Figure 3 comprises a pipe 21 through which liquid or gas
can flow. Chambers 22 and 23 extend from diametrically
opposed sides of pipe 21 and respectively house a light
source in the form of LED 24 and a photodetector 25.
Lenses 26 correspond to the lens 18 of Figure 1. In
order to avoid problems caused by changing refractive
indices in the fluid flowing through the pipe 21, the
light is projected into the measurement volume via a
curved window 27. Structure is given to the light
intensities in the measurement volume by means of a three
bar grating 28 located in front of the end of an optical
fibre 24' coupled to LED 24 and shown in plan in Figure
3A. It is of course possible that the structured light
could be generated by an appropriately facetted LED. The
light exiting the measurement volume passes via a curved
window 29, similar to window 27, and via a lens system 30
to the receiving light sensor 25, whereupon the output of
the light sensor is taken to the control circuit shown
in Figure 4 of the accompanying drawings.

Referring now to Figure 4 of the accompanying
drawings, this shows the light sensor 25 connected to the
input of the control circuit. This input is indicated
at 40 and leads to an operational amplifier circuit 41,
the output of which is taken to an analog-to-digital
converter (ADC) 42, in turn connected to a data bus 43.
The datastream coming from the ADC 42 is processed in a
digital signal processor (DSP) 44. A suitable processor
is the TMS320C32 manufactured by Texas Instruments.
Working area for the DSP 44 is provided by a RAM 45 and
the circuit also includes a ROM 46 which can be accessed
by the DSP 44 in order to carry out steps of the
processing which will be described hereinafter.

Apart from its major function the DSP 44 controls
the OFF and ON switching of the LED 24. In operation the
LED emits continuously.

Before the major function of the DSP 44 is
described, a minor function that it carries out is the
calibration of the light pulses supplied by the LED 24
to the measurement volume. The generation of these light
pulses is controlled by signals on a line 47 to a
digital-to-analog converter (DAC) 48 connected via an
output 49 to the LED 24. The level of the output of LED
24 is set by control signals on a line 50 also connected
to DAC 48. At predetermined intervals the DSP 44 will
cause the LED 24 to emit a series of short calibration
pulses, the presence of which, and level of which, will
be detected by the sensor 25 and utilised to set the post
ADC gain during normal measurement. In the present
embodiment the outputs of the DSP 44 are connected by
serial links 51 to a host processor indicated at 52.
This host processor may be a personal computer or a
laptop or a specific terminal configured purely for
carrying out the purposes of the present invention.

Once the data has been finally processed by the host
processor 52, it can be either displayed or printed for
further use. Figure 5 shows a typical printout which may
be obtained. This figure will be described in greater
detail hereinafter.

It will be appreciated that the computational
process is carried out on intermittent streams of digital
data leaving the ADC 42 caused by particles moving
through the measurement volume through the light field
generated by the DSP 44 intermittently actuating the LED
24. The ADC acts by sampling the analog input supplied
to it and in the present embodiment generates frames of
data. For example, in the present embodiment an output
similar to that shown in Figure 2A is sampled to generate
a frame having 48 samples of 10 bit data. In the
embodiment being described the computations required are
such that this volume of data cannot be handled without
exceptionally extensive data processing resources.
Accordingly the present embodiment carries out a data
reduction step on the sampled data. In this data
reduction step the 48 samples of a frame of interest are
reduced to a single 15 bit number D composed of quantised
transformed log moments. This data reduction step is one
of three important steps to be carried out during the
measurement process which will be described in greater
detail hereinafter. However, it will be appreciated that
the actual sampling and reduction parameters can be
varied. Additionally, the data reduction step is
triggered by the detection that the energy of the signal
being sampled is above a predetermined threshold, thus
indicating the presence of an event, in this case a
particle, passing through the measurement volume.
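
A minimal sketch of such a triggered data reduction step is given below. The passage does not specify which moments are used, how they are transformed, or how the 15 bits are allocated, so everything beyond the 48-sample frame, the 10 bit sample range and the energy trigger is an assumption made purely for illustration: three raw moments, a log1p transform, and 5 bits per moment. The names `reduce_frame`, `quantise` and `frame_energy` are hypothetical.

```python
import math

FRAME_LEN = 48      # samples per frame (from the text)
SAMPLE_MAX = 1023   # largest 10 bit sample value (from the text)


def frame_energy(frame):
    """Sum of squared samples, used here as the trigger quantity."""
    return sum(s * s for s in frame)


def quantise(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer code of the given bit width."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def reduce_frame(frame, threshold):
    """Return a 15 bit code for a triggered frame, or None below threshold."""
    if len(frame) != FRAME_LEN:
        raise ValueError("expected a frame of 48 samples")
    if frame_energy(frame) <= threshold:
        return None  # no event detected in this frame
    code = 0
    for order in (1, 2, 3):  # assumed: three moments, 5 bits each
        moment = sum(s ** order for s in frame) / FRAME_LEN
        log_moment = math.log1p(moment)          # assumed transform
        upper = math.log1p(SAMPLE_MAX ** order)  # largest possible value
        code = (code << 5) | quantise(log_moment, 0.0, upper, 5)
    return code
```

The point of the sketch is only the shape of the computation: a frame either fails the energy test and is discarded, or is collapsed to a small integer D that can index a look-up table.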

The ROM 46 stores a look-up table with estimated
values for the probability P(D|θ) for each value of D so
obtained and for each θ, where θ represents the particle
radius and velocity parameters that may be more or less
likely to have caused the observation of D. The
generation of the look-up table is a second important
step.

The third step is an inferential processing step
using Bayesian inference in which the distribution of θ
is inferred from the sequence of D values observed and
the prior distribution of θ.
The computations and actions required to carry out
these three steps will now be described in general terms,
bearing in mind that the specific embodiment shown in
Figures 3 and 4 for measuring particles using structured
light is only one embodiment. Thus there are many other
systems which could produce analog or digital data from
unobservable phenomena which could be analysed to produce
valuable results by the same computational process.

During the subsequent description the processing
probabilities may be expressed in one of three forms,
which are interconvertible and have different merits.
The three forms are probability, log probability, and log
odds. Probability and log probability are
self-explanatory; if p is a probability, the corresponding
log odds is the quantity $\log(p/(1-p))$. If x is the log
odds, the corresponding probability is $e^x/(1+e^x)$.
Another function which will be utilised in the subsequent
description is the function $f(x) = \log(1+e^x)$. This
function is frequently used during the processing to be
described, in the majority of instances in situations
where its value is needed at a series of regularly
spaced values of x, where the spacing interval
is $x_0$. This function is shown in Figure 6A and a very
close approximation can be obtained by storing a look-up
table on an $x_1 = 0.1$ interval grid and linearly
interpolating. In the case where the interval $x_0$ is a
small multiple of $x_1$, a series of values can be obtained
and added to another array in only three cycles of a
computational algorithm. The function f(x) can also be
utilised in adding a set of numbers expressed as their
logs. Thus given $a = \log(A)$ and $b = \log(B)$ it can be
noted that $\log(A+B) = a + f(b-a)$, without the need to
exponentiate or otherwise take logarithms. Repetition
of this operation adds a whole set of such numbers,
yielding the logarithm of the sum.
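
The look-up and log-domain addition just described can be sketched as follows. The 0.1 grid interval is taken from the text; the table range of ±40 and the names `f`, `log_add` and `log_sum` are assumptions for illustration, and the table is built here with library calls rather than read from a ROM.

```python
import math

STEP = 0.1      # grid interval of the look-up table (from the text)
X_MAX = 40.0    # assumed table range; outside it f(x) is taken as 0 or x

# Table of f(x) = log(1 + e^x) at x = -X_MAX, -X_MAX + STEP, ..., X_MAX
N_ENTRIES = int(round(2 * X_MAX / STEP)) + 1
TABLE = [math.log1p(math.exp(-X_MAX + k * STEP)) for k in range(N_ENTRIES)]


def f(x):
    """Approximate log(1 + e^x) by linear interpolation in TABLE."""
    if x <= -X_MAX:
        return 0.0
    if x >= X_MAX:
        return x
    pos = (x + X_MAX) / STEP
    k = min(int(pos), N_ENTRIES - 2)  # guard against rounding at the top edge
    frac = pos - k
    return TABLE[k] + frac * (TABLE[k + 1] - TABLE[k])


def log_add(a, b):
    """Return log(A + B) given a = log(A) and b = log(B)."""
    return a + f(b - a)


def log_sum(logs):
    """Logarithm of a sum, by repeated log_add over a non-empty list of logs."""
    total = logs[0]
    for value in logs[1:]:
        total = log_add(total, value)
    return total
```

With a 0.1 grid the interpolation error in f is at most about 0.0003, since the second derivative of f never exceeds 1/4, which is consistent with the description of the table as a very close approximation.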

This function has two main uses. Firstly it can be
used for converting from log odds to log probability.
Thus if $x = \log(p/(1-p))$, then $p = e^x/(1+e^x)$, and
$\log(p) = x - f(x)$. Secondly it can be used in carrying
out conversions from log odds to log(1 - probability).
Thus if $x = \log(p/(1-p))$, then $\log(1-p) = -f(x)$.

A second function of importance in the following
calculations is shown in Figure 6B. This is the function
$h(x) = \log(1-e^x)$. This function is used in the range
x<0 only. It is less well behaved than f(x) in that it has
an essential singularity and branch point at one end of
the range. In the following computations it is used less
frequently than f(x). The integer part of $\log_2(-x)$ can
be extracted by reading the exponent part of the floating
point representation of x. When h(x) is plotted against
$\log_2(-x)$ it is seen that only a short look-up table is
required for h(x) to cover small absolute values, with
h(x) at large absolute values of $\log_2(-x)$ being well
approximated by $\log(2)\,\log_2(-x)$ when $\log_2(-x)$ is
negative and by zero when $\log_2(-x)$ is positive.
Thus for different ranges of the function:

a) x < -20: h(x) = 0;

b) -20 < x < -0.1: h(x) is obtained by interpolation in
a table of interval 0.1;

c) -0.1 < x: h(x) is obtained by interpolation in
an effective table of entries at values of x of the
form $-2^{-n}$ for positive integer n, where the table
values are actually given by the formula $-n\log(2)$.



In the following description this function has the
following uses. Firstly computing log(1-p) from log(p).
Thus if $x = \log(p)$ then $\log(1-p) = h(x)$. Secondly
computing log odds from log probability. Thus if
$x = \log(p)$ then $\log(p/(1-p)) = x - h(x)$.
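
The three-range evaluation of $h(x) = \log(1-e^x)$ and its two uses can be sketched as below. The range boundaries and the 0.1 table interval follow the text. For the range closest to zero the sketch assumes that the interpolation between the entries $-n\log(2)$ is linear in $\log_2(-x)$, in which case it reduces to $\log(-x)$; that reading, and all function names, are assumptions.

```python
import math

STEP = 0.1     # table interval for the middle range (from the text)
X_LO = -20.0   # below this h(x) is taken as zero (from the text)
N = 200        # table entries at x = -20.0, -19.9, ..., -0.1

# Exact values of log(1 - e^x) on the grid, standing in for a stored table
TABLE = [math.log(-math.expm1(X_LO + k * STEP)) for k in range(N)]


def h(x):
    """Approximate log(1 - e^x) for x < 0 using the three ranges."""
    if x >= 0.0:
        raise ValueError("h(x) is only defined for x < 0")
    if x < X_LO:
        return 0.0
    if x > -STEP:
        # Entries -n*log(2) at x = -2**-n, interpolated linearly in
        # log2(-x), give log(2) * log2(-x) = log(-x) (assumed reading).
        return math.log(-x)
    pos = (x - X_LO) / STEP
    k = min(int(pos), N - 2)
    frac = pos - k
    return TABLE[k] + frac * (TABLE[k + 1] - TABLE[k])


def log_one_minus_p(log_p):
    """log(1 - p) from log(p)."""
    return h(log_p)


def log_odds_from_log_prob(log_p):
    """log(p / (1 - p)) from log(p)."""
    return log_p - h(log_p)
```

The approximation is coarsest just below x = -0.1, where h(x) curves sharply; a few hundredths of error in the log are to be expected there with these table parameters.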

Having now set out what are in essence the building
blocks of the processes to be described hereinafter, there
will now be given a description in general terms of the
computational procedures carried out by the DSP 44, and
in particular of the three steps which have already been
referred to.

The inferential computations will now be described
in a general form which is nevertheless applicable to the
embodiment of Figures 3 and 4.

Thus for the measurement of particle radii there is
provided a set of bins of particle radius values centred
on the values $r_i$ for $i = 1, \ldots, I_1$. These bins are
of course dependent on the expected range of particle
values for a particular application.

Similarly a set of bins is provided for the
measurement of particle velocity centred on the values
$v_i$ for $i = 1, \ldots, I_2$, which are similarly dependent
on the particular application.

For each combination of r and v there is then an
event type $E_i$, where $1 \le i \le I_1 I_2$, which indicates
that a particle of size r and velocity v passed through the
sensitive volume for that particle size during a
particular frame. The type of $D_j$ that results depends on
the noise present and on the z and y coordinates with
which the particle transited, but is governed by the
probabilities $P(D_j \mid E_i)$. The probability of each $E_i$
occurring depends on the fluid being assessed, and for
any particular fluid is given by $p_i$.

Nonetheless the invention is more generally
applicable to situations where the non-directly-observable
events $E_i$ are not related to particles or
their radius and velocity.

Given any particular fluid of "homogeneous" particle
concentration in which the particles to be measured are
carried there is associated with it a set of
probabilities $p_i$ that when passed through the particle
counter for one frame, which in the present embodiment
is an interval of 62 µs corresponding to the maximum time
of transit of the largest expected particle at the lowest
expected velocity, a particle of radius in the bin
centred on $r_i$ passes through the predefined "active
volume" of the sensor, where the active volume may depend
on i. It is assumed that there is negligible probability
of two such events occurring in the same frame, and that
one of the $r_i$, say $r_0$, is zero, representing the
possibility that there is no particle in the frame.
Clearly then $\sum_i p_i = 1$.

Now, when a given fluid with assumed stationary
input particle distribution (and hence fixed vector p
whose components are the $p_i$) is connected to the sensor
it is possible to have some prior idea what sort of
values the vector p is likely to take. This prior idea
is expressed in the form of a prior distribution on p,
which is denoted P(p). This distribution will have the
property that for p whose components do not sum to 1,
P(p) will be zero, while the (I-1)-dimensional integral
$\int P(p)\,dp = 1$. It may be broad and flat, indicating
that there are no preconceived ideas as to what sort of
fluid is to be looked at, or it may be restricted and
narrow, indicating that there is already a pretty good
idea of what sort of fluid it is because it has already
been processed in some way and experience has been
gathered on what sort of fluid is to be examined.

P(p) thus represents the input particle size
distribution, which as discussed does not
represent an actual measurement. After each trigger (or
indeed after some defined time period with no trigger)
there is generated some data D (which here represents the
calculated moments rather than the full data trace), and
the first stage of the inference process has obtained
the values of [equation not legible in the source].

What is required in order to provide a measurement is to
infer the posterior distribution of p, given the data,
P(p|D), which will then become the prior distribution for
the processing of the next trigger or tick. When the
measurement process is over, the final posterior
distribution after the last trigger was processed is the
final posterior particle size distribution that is
available for display in whatever complete or simplified
way is required; it will reflect not only what the
concentration of particles at each bin size is, but also
the degree of certainty with which the results are
presented. It is this inference of the posterior
distribution of p from the prior distribution of p for
each trigger that is the task of the update process to
be described.
For the purposes of this embodiment it is always
considered that the original prior distribution before
any data is examined will be a Dirichlet distribution,
given by

$$P(p) = \frac{\Gamma\left(\sum_i \alpha_i\right)}{\prod_i \Gamma(\alpha_i)} \prod_i p_i^{\alpha_i - 1}$$

where all $\alpha_i > 0$, because (a) it is easy to work with,
(b) it is the natural class of distributions that arises
from already having made some measurements on the fluid in
question having previously had no idea about it, and (c)
it allows all the likely types of priors which may be of
interest by suitable variation of the parameters $\alpha_i$.
For actual determination of a suitable set of $\alpha_i$ for a
particular application either a guess may be taken based
on intuition (including the totally flat prior, $\alpha_i = 1$
for all i), or a number of fluids can be assessed on the
basis of a totally flat prior, and inference made on the
$\alpha_i$ using an appropriate Bayesian inference method.

WO 99/41662 PCT/GB99/00488

One of the particular points about a Dirichlet that
is helpful is that the complete distribution P(p) is
defined by its marginal distributions $P(p_i)$. Note that
the Dirichlet is not a separable distribution; it is not
stated that $P(p) = \prod_i P(p_i)$. It suffices rather,
given a Dirichlet distribution, to record only the marginal
distributions to define the whole distribution, with
potential savings of memory. The difficulty that
prevents the simple recording of the $\alpha_i$ (with a huge
saving in memory and computation) is that the first
stage inference is uncertain. Complete certainty in the
first stage inference would allow only the $\alpha_i$ to be
recorded, as the distribution would remain a Dirichlet
distribution after each trigger is processed. In this
instance, it becomes instead a mixture of an increasingly
large number of Dirichlet distributions; yet although
recording of the $\alpha_i$ is no longer sufficient, recording
of the marginal distributions $P(p_i)$ is sufficient given
an approximation described hereinafter.
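
The idealised case mentioned above, in which the first stage inference is certain, can be illustrated in a few lines. In that case a Dirichlet prior stays a Dirichlet, each observed event simply adds one to the corresponding parameter, and each marginal is a Beta distribution. The sketch below shows only this idealised case, which is not the situation in the embodiment; the function names are hypothetical.

```python
def dirichlet_update(alpha, counts):
    """Posterior Dirichlet parameters after directly observed event counts."""
    return [a + n for a, n in zip(alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of each component p_i under a Dirichlet with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]


def marginal_beta_params(alpha, i):
    """Parameters of the Beta marginal of p_i: (alpha_i, sum(alpha) - alpha_i)."""
    return alpha[i], sum(alpha) - alpha[i]
```

For example, starting from the totally flat prior `[1, 1, 1]` and observing event 0 three times and event 2 once gives parameters `[4, 1, 2]`, and the marginal of the first component is then Beta(4, 3).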

Thus, there exists an unknown probability vector p,
with components $(p_i)_{i=1,\ldots,I}$ such that

$$\sum_{i=1}^{I} p_i = 1$$

corresponding to a set of I mutually exclusive events
$(E_i)_{i=1,\ldots,I}$ which are not directly observable, one
of which occurs in each time period, $E_i$ with the
probability $p_i$. Each of these events corresponds, for
example, to the passage of a particle through the
structured light field shown in Figure 1A. There is also
provided a known matrix A, and a set of mutually exclusive
observable events $(D_j)_{j=1,\ldots,J}$ one of which occurs
in each time period. The matrix A has components
$(a_{i,j})_{i=1,\ldots,I;\,j=1,\ldots,J}$ which are the values
of the conditional probabilities $P(D_j \mid E_i) = a_{i,j}$.
For any one time period, D may be assumed
to be conditionally independent of p given E, and E for
one time period may be assumed to be conditionally
independent of E for any other time period given p.
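
The probabilistic model just stated can be made concrete with a small simulation, which is a sketch only and uses hypothetical names: in each time period a hidden event index is drawn according to p, and an observable index is then drawn according to the corresponding row of A.

```python
import random


def simulate(p, A, periods, seed=0):
    """Draw hidden events E and observations D for a number of time periods.

    p is the list of event probabilities p_i; A[i][j] is P(D_j | E_i).
    Returns two lists of indices of equal length.
    """
    rng = random.Random(seed)
    events = rng.choices(range(len(p)), weights=p, k=periods)
    observations = [
        rng.choices(range(len(A[e])), weights=A[e], k=1)[0] for e in events
    ]
    return events, observations
```

Only the observations would be available to the inference process; the events are returned here so that a simulated run can be checked against the truth.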

As already stated, in order to carry out meaningful
calculations it is also necessary to have a prior
distribution P(p) on p. The purpose of the computation
to be described in the following is to take this prior
distribution, in general terms P(p), and in effect
carry out an iterative process on each subsequent set of
measurements so as to eventually reach the most probable
distribution of the events that have occurred. Thus the
underlying task is to infer the posterior distribution
P(p|D), where $D = (D_{j_k})_{k=1,\ldots,K}$ is the sequence of
observed events. In particular, the processing is directed
towards identifying its marginal distributions $P(p_i \mid D)$.
Thus the present embodiment carries out an algorithm
which updates the prior marginal distributions $P(p_i)$ to
get the posterior marginal distributions $P(p_i \mid D_{j_1})$
that result after a single time period. This step can then
be repeated once for each time period, the posterior from
each time period being used as the prior for the next.

This algorithm involves the set of equations listed
in Appendix A as set 1.
All equations or lists of equations referred to by
numbers in this specification are set out in Appendix A
attached to the end of this specific description.

In the last line of the equations in set 1 the
approximation has been made

$$P(E_h \mid \bar{E}_i, p_i) \approx P(E_h \mid \bar{E}_i) \quad \text{for } h \neq i,$$

which is an exact equality if the prior on p is a
Dirichlet distribution. For arbitrary mixtures of
Dirichlets this may be far from the truth. However, the
approximation can be shown to provide
excellent results for the set of Dirichlets that arise
in practice. The above derivations are continued as
shown in the set of equations 2 of Appendix A.
In the last equation of this set $f(x) = \log(1+e^x)$ and K
is some constant whose value is unimportant; it is known
that $\int P(p_i \mid D)\,dp_i = 1$, so the value of K can be
determined later. In practice the values of K are set to
keep the stored values of $\log P(p_i \mid D)$ in reasonable
range.

In practice the value of $\log P(p_i \mid D)$ cannot be stored
for every possible value of $p_i$, so there is chosen an
appropriate set of values of $p_i$ for which
$\log P(p_i \mid D)$ is remembered. Suppose there are M such
points. Then the above expression has to be evaluated MI
times for each time period. These evaluations are carried
out by the DSP 44 of Figure 4.
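
Appendix A is not reproduced in this part of the description, so the following is only a sketch of one time-period update of the marginals that is consistent with the approximation stated above; it should not be read as the equations of sets 1 and 2. It assumes that, after observing $D_j$, each marginal is reweighted by $p_i\,a_{i,j} + (1-p_i)\sum_{h \neq i} a_{h,j}\,P(E_h)/(1-P(E_i))$, with $P(E_i)$ taken as the current mean of $p_i$. For simplicity it works with linear probabilities on a uniform grid rather than with log values on a grid of evenly spaced log odds, and all names are hypothetical.

```python
def marginal_mean(grid, weights):
    """Mean of p_i under a marginal stored as normalised weights on grid points."""
    return sum(q * w for q, w in zip(grid, weights))


def update_marginals(marginals, grid, A, j):
    """Update every marginal P(p_i) after observing D_j in one time period.

    marginals[i][k] is the weight of p_i = grid[k]; A[i][j] is P(D_j | E_i).
    The whole procedure is an assumed reading of the approximation
    P(E_h | not E_i, p_i) ~ P(E_h | not E_i).
    """
    n_events = len(marginals)
    p_event = [marginal_mean(grid, m) for m in marginals]  # P(E_i)
    updated = []
    for i in range(n_events):
        # Likelihood of D_j given that E_i did not occur
        other = sum(A[h][j] * p_event[h]
                    for h in range(n_events) if h != i) / (1.0 - p_event[i])
        post = [w * (q * A[i][j] + (1.0 - q) * other)
                for q, w in zip(grid, marginals[i])]
        total = sum(post)
        updated.append([w / total for w in post])
    return updated
```

As a check on the sketch, with a flat Dirichlet prior over three events (each marginal proportional to 1 - q) and an identity matrix A, observing $D_0$ moves the mean of $p_0$ from 1/3 to 1/2 and the means of the other components to 1/4, which is what exact Dirichlet updating gives.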
By making one further approximation the computation
can be so organised that each of the MI evaluations need
take only slightly more than three CPU cycles on a
processor such as the DSP 44.

In carrying out the evaluations it is noted that
$P(D_j \mid E_i) = a_{i,j}$. For each $p_i$, there is selected a
set of values $q_{i,k}$ such that the values of

$$\log\frac{q_{i,k}}{1 - q_{i,k}}$$

as k varies are uniformly spaced, and at any time the
values of $\log P(p_i \mid D)$ for $p_i = q_{i,k}$ for each k are
remembered.

The function f above is implemented using
interpolation in the look-up table stored in ROM 46;

f is used to perform the summation over h in the
log domain;

the values of [expression only partly legible in the
source; the surviving fragments read "log" and
"1 - P(E)"] are only updated once every N time periods,
for some integer N of the order of 20 (this effectively
introduces a further approximation);

the updates to $P(q_{i,k} \mid D)$ are combined for several
successive time periods, in order to increase efficiency.
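
A grid of probabilities whose log odds are uniformly spaced can be generated by applying the inverse of the log odds transform to an evenly spaced sequence. The sketch below shows this; the interval end points and the number of points are left as parameters because the passage does not state them, and the function name is hypothetical.

```python
import math


def log_odds_grid(lo, hi, m):
    """Return m probabilities whose log odds are evenly spaced on [lo, hi]."""
    if m < 2:
        raise ValueError("need at least two grid points")
    step = (hi - lo) / (m - 1)
    return [1.0 / (1.0 + math.exp(-(lo + k * step))) for k in range(m)]
```

Such a grid places points densely near probabilities of 0 and 1 in relative terms, which suits marginals that concentrate on very small probabilities, as would be expected for rare particle events.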
In carrying out the above calculation some
approximations have been made:

1. P(p) is adequately represented by its marginals
$P(p_i)$ in the same way that a Dirichlet distribution is;
to be precise, that $P(E_h \mid \bar{E}_i, p_i) = P(E_h \mid \bar{E}_i)$
for $h \neq i$.

2. $P(p_i)$ is adequately represented by its values
on the points $p_i = q_{i,k}$, where the $q_{i,k}$ are a set of
probabilities whose corresponding log odds values are
evenly spaced over some interval;

3. $P(E_i)$ changes sufficiently slowly that for the
purposes of updating $P(p_i \mid D)$ it may be reevaluated
after every N time periods rather than after every time
period, where roughly speaking N < 20.

In the previous section it was assumed that the
values in the matrix A were known. This matrix A is
effectively the look-up table (LUT) which is stored in
ROM 46 of Figure 4 and is referred to in step S2 of the
flow diagram of Figure 7. However, determining these
values is one of the tasks to be performed in most
situations where the methods of the previous section
would be needed, and in particular in the particle sizer
shown in Figures 3 and 4.

Two possible approaches for filling the look-up table are: firstly, simulate the hardware in such a way that the normally unobservable events $E_i$ can be caused to occur during each time period at will, use the simulation to determine the value of $D_j$ that occurs for each time period, and hence infer the maximum a posteriori or alternatively the mean posterior values of the elements of A; or secondly, use the actual hardware under circumstances where the value of p is known (or, failing that, where a distribution of the possible values for p is known), collecting the numbers of each $D_j$ that occur, and hence infer A as will be described.

It is of course possible to provide hybrid methods which combine the features of each of the above approaches.

The best technique known at present is to set the prior distribution of A as Dirichlet as follows. Let a now be the probability vector for the various possible values of D that occur for a given E (which is now considered fixed). This is given a prior Dirichlet distribution by setting

$$P(a \mid \alpha) = \frac{\Gamma\left(\sum_j \alpha_j\right)}{\prod_j \Gamma(\alpha_j)} \prod_j a_j^{\alpha_j - 1}$$

where all the $\alpha_j$ are equal, positive, and in turn have prior distribution set (for example) by $P(\alpha) = \beta e^{-\beta\alpha}$ for some positive constant $\beta$ of magnitude not far from 1.

Let it be supposed that $n_j$ occurrences of $D_j$ are observed during the time periods in which E is true, and that $N = \sum_j n_j$. Then the various alternative values that can be taken for $a_j$ are:

a) The maximum likelihood value: $a_j = n_j / N$;

b) The mean posterior value, which for any fixed value of $\alpha$ is given by

$$a_j = \frac{n_j + \alpha}{N + J\alpha},$$

where J is the number of possible values of D, and which for general $\alpha$ drawn from the prior may be determined using (for example) Markov Chain Monte Carlo methods;

c) The modal posterior value, which may be obtained by using any standard optimisation technique on the posterior distribution $P(a, \alpha \mid n) = Q\,P(\alpha)\,P(a \mid \alpha)\,P(n \mid a, \alpha)$ for some unimportant constant Q;

d) The mean posterior value of a for the modal posterior value of $\alpha$.
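Alternatives a) and b) can be illustrated for a fixed value of $\alpha$ by the following minimal sketch. It relies on the standard result that a symmetric Dirichlet prior with parameter $\alpha$ over J outcomes gives a posterior mean of $(n_j + \alpha)/(N + J\alpha)$. The function names and the example counts are hypothetical.

```python
def maximum_likelihood(counts):
    """Alternative a): a_j = n_j / N."""
    total = sum(counts)
    return [n / total for n in counts]


def mean_posterior(counts, alpha):
    """Alternative b) for a fixed alpha: a_j = (n_j + alpha) / (N + J * alpha)."""
    total = sum(counts)
    size = len(counts)  # J, the number of possible values of D
    return [(n + alpha) / (total + size * alpha) for n in counts]
```

For counts of 3, 1 and 0 the maximum likelihood value assigns zero probability to the unobserved outcome, whereas the mean posterior value with $\alpha = 1$ assigns it 1/7. This is one reason the smoothed estimate may be preferred when the entries are later used as log odds.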

The second approach is relatively laborious.

It is required that the equipment be set up in such a way that (at different times) the probability vector p takes on a number of known values, say $q_k = (q_{k,i})$.

Let $a_{i,j}$ be the probability of getting $D_j$ given $E_i$, and $n_{j,k}$ be the number of times during the running of the kth setup that $D_j$ is observed. Let A denote the matrix consisting of the $a_{i,j}$'s.

First a prior distribution on A has to be chosen. A recommendation is a Dirichlet of the same type used in the preceding section.

The mean posterior value of A is then inferred, given the data $N = (n_{j,k})$.

The likelihood of N given A is given by the set of equations 3 in Appendix A. Now, it is noted that $P(A) = 0$ unless $(\forall i)\ \sum_j a_{i,j} = 1$. Markov Chain Monte Carlo methods are used to determine the mean of the posterior, so random samples are drawn from $P(A \mid N)$. In order to make this easier given the restrictions on A, there is written $a_{i,j} = b_{i,j}/c_i$, where $c_i = \sum_h b_{i,h}$, and instead of sampling from $P(A \mid N)$, sampling is carried out from $P(A, c \mid N)$, where $P(c \mid A, N) = P(c)$, which is Gaussian. This can be achieved by (in random order) Metropolis sampling from the
conditional distributions 

and 

$$P\left(c_{i_0} \mid A, N, (c_i)_{i \neq i_0}\right) = P(c_{i_0})$$

for all possible values of $i_0$.
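Any such sampler has to evaluate the likelihood of N given A at each proposal. A minimal sketch of that evaluation is given below. It assumes a multinomial form in which, during setup k, the probability of observing $D_j$ is $\sum_i q_{k,i} a_{i,j}$. The function names and data layout are hypothetical, and the Metropolis loop itself is not shown.

```python
import math


def normalise_rows(b):
    """Form a[i][j] = b[i][j] / c_i with c_i the sum of row i of b."""
    return [[x / sum(row) for x in row] for row in b]


def log_likelihood(counts, q, a):
    """log P(N | A) up to an additive constant (assumed multinomial form).

    counts[k][j] : number of times D_j was observed in setup k
    q[k][i]      : known probability of E_i in setup k
    a[i][j]      : P(D_j | E_i)
    """
    total = 0.0
    for n_k, q_k in zip(counts, q):
        for j, n in enumerate(n_k):
            if n:
                p_dj = sum(q_k[i] * a[i][j] for i in range(len(a)))
                total += n * math.log(p_dj)
    return total
```

Working with the unnormalised $b_{i,j}$ and dividing by the row sums keeps every proposed matrix on the constraint $\sum_j a_{i,j} = 1$ without any special handling in the proposal step.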

It is of course necessary to arrive at values for D to enable the foregoing calculation to be carried out. In the present embodiment this involves the data reduction step already referred to, which will now be described in greater detail.

Thus the symbols $E_i$ in the above descriptions correspond in the actual implementation under consideration to the events "a particle of a particular size travelling at a particular velocity was present in the sensitive volume appropriate to that size and velocity during the time period under consideration", where the index i varies with the size and velocity in question, with one or more values of i indicating that no particle fulfilled these conditions during the time period under consideration.

The actual events referred to as $D_j$ above might in theory be the data signal recorded during a given time period. In practice, however, such a data signal contains too much information to process in the time available. The $D_j$ observed are therefore a "digested" version of the said data signal. The method used to "digest" them in this particular implementation will now be described in greater detail.

The particular "digestion" or data reduction method used is to take the data signal consisting of ADC samples $y = (y_m)_{m=1,\ldots,M}$ and make the calculations set out in the set of equations 4 in Appendix A, where T is an empirically determined matrix. Then the D value corresponding to y is a quantised version of v.

WO 99/41662    PCT/GB99/00488

The design rationale being employed here is that the set of values $u_1, u_2, u_3, u_4, \ldots$ completely determine the set of values $\{y_1, \ldots, y_M\}$; that the $u_1', u_2', u_3', u_4', \ldots$ completely determine the set of values $\{z_m\}$; and that both together go a long way towards determining the vector y. The manner in which this determination occurs is that the successive $u_n$ values determine the successive derivatives at the origin of the spectrum of the (order independent) distribution of $y_m$ values. If it is wanted to retain a small amount of information about y in a diffuse rather than local manner, u is one possible way of doing it.

The purpose of T and t is to minimise the loss of information due to quantisation; T and t are chosen empirically for this purpose based on the range of values of v that are observed during typical (possibly simulated) data collection.
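The affine map $v = Tu + t$ followed by quantisation can be sketched as below. The sketch rests on several assumptions that are not stated in the specification: that T and t have been chosen so that typical values of v fall in the interval [0, 1), that v has three components, and that each component is quantised to 5 bits and packed into a single 15 bit number. The function names are hypothetical.

```python
def affine(T, t, u):
    """Compute v = T u + t for a matrix T given as a list of rows."""
    return [sum(row[c] * u[c] for c in range(len(u))) + offset
            for row, offset in zip(T, t)]


def quantise(v, bits_per_component=5):
    """Clamp each component of v to [0, 1) and pack the levels into one integer."""
    levels = 1 << bits_per_component
    code = 0
    for x in v:
        level = min(levels - 1, max(0, int(x * levels)))
        code = (code << bits_per_component) | level
    return code
```

Values of v outside the assumed range are clamped to the nearest level rather than wrapped. Choosing T and t from the observed spread of v keeps such clamping rare and spreads the data across the available levels.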

Returning now to the flow diagram of Figure 7, it will be appreciated from the foregoing that two steps have to be taken before the actual calculations on real events can be carried out. These prior steps are shown in Figure 7 at S1 and S2. At step S1 the prior is set. As explained already the prior is in Dirichlet form or is a mixture of a well behaved set of Dirichlets. In step S2 the contents of the look-up table stored in ROM 46 are calculated as already described. Once the prior has been set and the look-up table filled, the actual process of measurement can begin using the apparatus described in general terms in Figures 1 and 2 and in greater detail in Figures 3 and 4.

Thus at step S3 a sample is taken from the output of ADC 42 (Figure 4). This sampling is carried out at regular intervals, which in the present embodiment is every 1/780000 of a second. Naturally this may well vary if different events are being measured. The sampled data from the ADC is supplied to a trigger in step S4. In the present embodiment the trigger decides if the energy content in the sample is significant or not. If the energy content is significant, which in the present embodiment may correspond to the passage of a particle through the measurement zone, the signal is reduced in the manner already described to provide a 15 bit number D. This number is entered at step S5 in a data queue. The fact that no significant signals are occurring is also of interest. If the fluid flow is particularly clear then triggered events will be sparse. Thus every predetermined period a tick entry is entered in the data queue. This is done at step S6. The resulting queue is indicated at S7. At step S8 the oldest entry is removed from the queue and at step S9 the Bayesian inference step already described is carried out utilising initial marginals calculated in step S10 from the prior set in step S1 and with reference to the look-up table set up in step S2. As already described, in step S9 the prior marginal distributions $P(p_i)$ are updated to get the posterior marginal distributions $P(p_i \mid D_j)$ that result after a single time period.
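The flow of steps S3 to S8 can be sketched as a simple producer and consumer around a first-in, first-out queue. The sketch is illustrative only: it treats the signal as a sequence of blocks of samples, uses a sum of squares as the energy test, and enters a tick after a fixed number of blocks. The names `build_queue`, `drain` and `reduce_fn`, and the threshold values, are hypothetical.

```python
from collections import deque


def energy(block):
    """Sum of squared sample values in one block (assumed energy measure)."""
    return sum(x * x for x in block)


def build_queue(blocks, threshold, tick_every, reduce_fn):
    """Steps S4 to S7: queue a reduced value D per trigger, plus regular ticks."""
    queue = deque()
    for index, block in enumerate(blocks, start=1):
        if energy(block) >= threshold:          # trigger (step S4)
            queue.append(("event", reduce_fn(block)))   # step S5
        if index % tick_every == 0:             # periodic tick (step S6)
            queue.append(("tick", None))
    return queue


def drain(queue, handle):
    """Step S8: remove the oldest entry first and pass it to the inference step."""
    while queue:
        handle(queue.popleft())
```

The tick entries allow the inference step to register time periods in which nothing was detected, which in turn carries information about low particle concentrations.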

Essentially step S9 comprises two inferential subprocesses. The first of these is to access the LUT set up in step S2. This access is dependent on the value of the entry removed from the queue in step S8. As described in the part of the specification dealing with the compression of the data taken from the ADC, $D_j$ can be expressed in terms of $u_2$, $u_3$, $u_2'$, and the values $u_2$, $u_3$ and $u_2'$ or alternatively their transformed quantised counterparts are used to determine whether the LUT holds the values of $a_{i,j}$ or not. If for a value $D_j$, $u_2'$ lies outside a particular range dependent on $u_2$ and $u_3$ then the LUT returns a standard set of equal values.

This can be done if the data is stored in the LUT in compressed form, as it has been found that a complete table actually contains many duplicated values.

If $u_2'$ lies within the range then the set of 187 1-byte values for log odds of

The decompressed log odds so obtained are then converted to log probabilities using function f(x) shown in Figure 5A. The second set of inferential processes is the Bayesian inference procedure already described in the present specification. The result of each step of this process is a marginal distribution which can be read out in step S11 and displayed or printed, an example of which is shown in the graph of Figure 5.
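The conversion from log odds to log probabilities can be illustrated by the standard identity $\log p = -\log(1 + e^{-x})$ for $x = \log\frac{p}{1-p}$. The sketch below uses this identity directly and is not a reproduction of the function f(x) of Figure 5A; in the embodiment the logarithm would be obtained by table interpolation rather than by a library call. The function name is hypothetical.

```python
import math


def log_prob_from_log_odds(x):
    """Return log p for x = log(p / (1 - p)), avoiding overflow for large |x|."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    # For negative x use the equivalent form x - log(1 + e^x).
    return x - math.log1p(math.exp(x))
```

Splitting on the sign of x keeps the argument of the exponential non-positive, so that very large or very small log odds do not overflow.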

Each of the points on the graph refers to one of the bins for a particular range of particle radii already referred to in this description. Thus according to the graph, over the period of the measurement the probability was that there were over 10-^ counts of particles per millilitre of the smallest size bin, 280 counts of particles per millilitre of the next sized bin and so on. It has to be emphasised that these are not actual counts but a representation of the probability distribution of the particles which caused events in the flow chart of Figure 5. As these points actually represent probabilities it does not mean that there were actually 10^ particles in the first "bin". In fact upper and lower confidence factors are also given to the points which will indicate that there may be fewer or more counts in each bin. However the actual graph is a very clear representation giving a considerable amount of information. The lower and upper ends of the bar indicate that the posterior probability of the concentration of particles in that bin being in the range indicated is 0.9. This figure can be adjusted up or down giving a wider or narrower range.
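One way in which such a range might be obtained from a marginal distribution held on a grid is sketched below. The specification does not state how the ends of the bar are computed; the sketch assumes a central interval that leaves equal probability in each tail. The function name and the example distribution are hypothetical.

```python
def credible_interval(values, probs, mass=0.9):
    """Central interval of a discrete posterior holding at least `mass` probability.

    values : grid points in increasing order
    probs  : posterior probability of each grid point (summing to 1)
    """
    tail = (1.0 - mass) / 2.0
    cumulative = 0.0
    lower = values[0]
    upper = values[-1]
    found_lower = False
    for value, prob in zip(values, probs):
        cumulative += prob
        if not found_lower and cumulative > tail:
            lower = value           # first point past the lower tail
            found_lower = True
        if cumulative >= 1.0 - tail:
            upper = value           # first point reaching the upper tail
            break
    return lower, upper
```

Raising `mass` widens the interval and lowering it narrows the interval, which corresponds to adjusting the figure of 0.9 up or down.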

It will be appreciated that the above description has been given with regard to particles moving through a light beam so as to cause perturbations which are subsequently detected. However as already mentioned the statistical algorithm described in the present specification is capable of application in other fields. A simple example of this is that it might be desired, for management purposes, to measure the use to which a digital data link is being put without being allowed, for example for security or confidentiality reasons, to access the actual data being transmitted. Thus it might be required to measure how many of the packets transmitted over the link represent uncompressed video, how many uncompressed audio, how many MPEG compressed video, how many ASCII text and so on. As the data itself cannot be accessed the only data available to be used in making the measurements is the check sums and lengths of the packets.

Following the procedure already described, $E_1$ can be allocated to the event that an uncompressed video packet was sent, $E_2$ the event that an uncompressed audio packet was sent, and so on for each of the packet types. Additionally $p_1$, $p_2$ etc. represent the true values of the probabilities of the different types of packet being sent.

It is now possible to form the matrix of probabilities $P(D_j \mid E_i) = a_{i,j}$ by a study of the check sums and lengths of packets that have been transmitted by a known source. Each set of data, that is each check sum and length pair $D_j$, can be observed and from this there can be inferred by the method already described the posterior distributions $P(p_i \mid D_1, D_2, \ldots)$. Thus the user can determine how likely it is that each packet transmitted is video, audio etc. and how certain these probabilities are.
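The per-packet part of this inference is an application of Bayes' rule, $P(E_i \mid D_j) \propto a_{i,j} P(E_i)$, and can be sketched as follows. The sketch covers only the classification of a single packet given current estimates of $P(E_i)$, not the inference of the distributions over $p_i$. The matrix entries used in the accompanying check are invented for illustration and the function name is hypothetical.

```python
def posterior_over_types(prior, a, j):
    """Return P(E_i | D_j) for every packet type i.

    prior[i] : current estimate of P(E_i)
    a[i][j]  : P(D_j | E_i), learned from packets sent by a known source
    j        : index of the observed check sum and length pair
    """
    weights = [prior[i] * a[i][j] for i in range(len(prior))]
    total = sum(weights)
    return [w / total for w in weights]
```

With two equally likely packet types and an observation that is four times as likely under the first type as under the second, the first type receives a posterior probability of 0.8.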
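$$P(p_i \mid D_j) = \frac{P(p_i)\,P(D_j \mid p_i)}{\int \text{numerator}\; dp_i}$$

$$= \frac{P(p_i)\left[P(D_j, E_i \mid p_i) + \sum_{h \neq i} P(D_j, E_h \mid p_i)\right]}{\int \text{numerator}\; dp_i}$$

$$= \frac{P(p_i)\left[P(D_j \mid E_i, p_i)\,p_i + (1 - p_i)\sum_{h \neq i} P(D_j \mid E_h, \bar{E}_i, p_i)\,P(E_h \mid \bar{E}_i, p_i)\right]}{\int \text{numerator}\; dp_i}$$

(since $(h \neq i \wedge E_h) \Rightarrow \bar{E}_i$)

$$= \frac{P(p_i)\left[P(D_j \mid E_i)\,p_i + (1 - p_i)\sum_{h \neq i} P(D_j \mid E_h)\,P(E_h \mid \bar{E}_i, p_i)\right]}{\int \text{numerator}\; dp_i}$$

(since D is conditionally independent of p given E)

$$= \frac{P(p_i)\left[P(D_j \mid E_i)\,p_i + (1 - p_i)\sum_{h \neq i} P(D_j \mid E_h)\,P(E_h \mid \bar{E}_i)\right]}{\int \text{numerator}\; dp_i}$$

(by approximation 1)

$$= \frac{P(p_i)\left[P(D_j \mid E_i)\,p_i + (1 - p_i)\sum_{h \neq i} P(D_j \mid E_h)\,\dfrac{P(E_h)}{1 - P(E_i)}\right]}{\int \text{numerator}\; dp_i}$$

(since $(h \neq i \wedge E_h) \Rightarrow \bar{E}_i$)

$$= \frac{P(p_i)\left[P(D_j \mid E_i)\,p_i\,(1 - P(E_i)) + (1 - p_i)\sum_{h \neq i} P(D_j \mid E_h)\,P(E_h)\right]}{\int \text{numerator}\; dp_i}$$

(since multiplication of the numerator by a constant, in particular by $(1 - P(E_i))$, changes nothing by virtue of the denominator)

$$= \frac{P(p_i)\,(1 - p_i)\left[\sum_{h \neq i} P(D_j \mid E_h)\,P(E_h) + P(D_j \mid E_i)\,P(E_i)\,\dfrac{1 - P(E_i)}{P(E_i)}\,\dfrac{p_i}{1 - p_i}\right]}{\int \text{numerator}\; dp_i}$$

Therefore

$$\log P(p_i \mid D_j) = \log P(p_i) + \log(1 - p_i) - \log\int \text{numerator}\; dp_i + \log\left[\sum_{h \neq i} P(D_j \mid E_h)\,P(E_h) + P(D_j \mid E_i)\,P(E_i)\,\frac{1 - P(E_i)}{P(E_i)}\,\frac{p_i}{1 - p_i}\right]$$

EQUATIONS 3

$$P(N \mid A) \propto \prod_k P\left((n_{j,k})_j \mid A\right)$$

$$= \prod_k \prod_j \left[\sum_i P(D_j, E_i \mid A)\right]^{n_{j,k}}$$

$$= \prod_k \prod_j \left[\sum_i q_{k,i}\,a_{i,j}\right]^{n_{j,k}}$$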



EQUATIONS 4

[Definitions of the components of u as sums over the samples, normalised by 1/M and 1/(M-1) and including a logarithm; not legible in the source.]

$$v = Tu + t$$